A Practical Approach for Few Shot Learning with SetFit for Scaling Up Search and Relevance Ranking on a Large Text Database
Fernando Vieira da Silva • Location: TUECHTIG • Haystack EU 2023
SetFit (Sentence Transformer Fine-tuning) is a recently proposed few-shot learning technique that has achieved state-of-the-art results on multiple classification problems in label-scarce settings, even outperforming GPT-3 in many cases. For learning to rank, SetFit can be especially valuable when only a few training samples are available.
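At the core of SetFit's first stage is contrastive fine-tuning of a sentence transformer on sentence pairs generated from the few labeled examples: pairs from the same class become positives, pairs from different classes become negatives. A minimal sketch of that pair-generation step in plain Python, with toy data and hypothetical label names (not taken from the talk):

```python
from itertools import combinations

def generate_pairs(examples):
    """Build (text_a, text_b, label) training pairs from a handful of
    labeled examples: same-class pairs are positives (1), cross-class
    pairs are negatives (0). This mirrors the contrastive pair
    generation at the heart of SetFit's fine-tuning stage."""
    pairs = []
    for (text_a, cls_a), (text_b, cls_b) in combinations(examples, 2):
        pairs.append((text_a, text_b, 1 if cls_a == cls_b else 0))
    return pairs

# Toy few-shot dataset: two labeled examples per class (labels hypothetical).
examples = [
    ("statute of limitations for contract claims", "relevant"),
    ("deadline to file a breach-of-contract suit", "relevant"),
    ("recipe for chocolate cake", "irrelevant"),
    ("weather forecast for Berlin", "irrelevant"),
]

pairs = generate_pairs(examples)
# 4 examples yield C(4,2) = 6 pairs: 2 positives and 4 negatives.
```

In the actual technique, these pairs train the sentence transformer with a similarity loss, after which a lightweight classification head is fitted on the resulting embeddings.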
In our case study, we collected a small dataset in the legal research domain, consisting of real-world search queries along with relevant and irrelevant results, manually annotated by lawyers or law students.
We then trained a model with the SetFit technique and used it to generate sentence embeddings for a larger dataset to enable semantic search. We also trained a ranking model with SetFit and compared its results with other approaches for the same language and the legal domain.
In this talk, we present SetFit and its application to ranking, and we discuss the results of our experiments.
Download the Slides • Watch the Video

Fernando Vieira da Silva
N2VEC

Fernando is the CEO of N2VEC, a startup that develops a search engine API for enterprise documents. He also has a Ph.D. in Computer Science with a focus on Natural Language Processing and experience working with search, relevance ranking, NER and text classification.